Text-directed speech enhancement employing phone class parsing and feature map constrained vector quantization

Authors

  • John H. L. Hansen
  • Bryan L. Pellom
Abstract

There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classifications. Classes considered include stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used to direct a new vector-quantizer-based enhancement scheme in which phone-class directed constraints are applied to improve speech quality. The proposed algorithm is evaluated using both objective and subjective quality assessment techniques. It is shown that the text-directed approach improves the quality of the degraded speech over a broad range of noise sources (i.e., flat communications channel noise, aircraft cockpit noise, helicopter fly-by noise, and automobile highway noise) and over a broad range of signal-to-noise ratios (i.e., 10, 5, 0 and -5 dB). In each case, using the Itakura-Saito measure and a 100-sentence evaluation speech corpus, the proposed method is shown to consistently exhibit improved objective quality over linear and generalized spectral subtraction, as well as over the Auto-LSP constrained iterative enhancement method. Subjective quality assessment was conducted in the form of an A-B comparison test. Results of these evaluations demonstrate that, for wideband noise distortions, the proposed algorithm is preferred over the unprocessed noisy speech by more than 2 to 1, and over spectral subtraction by more than 3 to 1.
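Two ideas from the abstract can be illustrated with a minimal Python sketch: grouping a phone-level alignment into broad phone-class regions, and scoring spectra with the Itakura-Saito measure. This is not the authors' implementation. The phone table `BROAD_CLASS` is a hypothetical, partial mapping, the function names `partition_by_class` and `itakura_saito` are invented for illustration, the alignment is assumed to be already available as (phone, start frame, end frame) tuples, and the Itakura-Saito distortion is written in its power-spectrum form, which may differ from the formulation used in the paper.

```python
import math

# Hypothetical, partial phone-to-broad-class table (an assumption for illustration).
BROAD_CLASS = {
    "p": "stop", "t": "stop", "k": "stop", "b": "stop", "d": "stop", "g": "stop",
    "s": "fricative", "z": "fricative", "f": "fricative", "sh": "fricative",
    "ch": "affricate", "jh": "affricate",
    "m": "nasal", "n": "nasal", "ng": "nasal",
    "iy": "vowel", "ae": "vowel", "aa": "vowel", "uw": "vowel",
    "w": "semivowel", "y": "semivowel", "l": "semivowel", "r": "semivowel",
    "ay": "diphthong", "oy": "diphthong", "aw": "diphthong",
    "sil": "silence",
}


def partition_by_class(alignment):
    """Merge adjacent (phone, start, end) entries that share a broad class.

    Returns a list of (class, start, end) regions in frame units.
    """
    segments = []
    for phone, start, end in alignment:
        cls = BROAD_CLASS[phone]
        if segments and segments[-1][0] == cls and segments[-1][2] == start:
            # Same class and contiguous in time: extend the previous region.
            segments[-1] = (cls, segments[-1][1], end)
        else:
            segments.append((cls, start, end))
    return segments


def itakura_saito(p, q):
    """Mean Itakura-Saito distortion between two positive power spectra.

    d(P, Q) = mean_k( P_k / Q_k - ln(P_k / Q_k) - 1 ); zero only when P == Q.
    """
    if len(p) != len(q) or not p:
        raise ValueError("spectra must be non-empty and of equal length")
    total = 0.0
    for pk, qk in zip(p, q):
        ratio = pk / qk
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)


if __name__ == "__main__":
    # Hypothetical alignment: (phone, start_frame, end_frame).
    alignment = [("sil", 0, 10), ("s", 10, 18), ("iy", 18, 30),
                 ("ae", 30, 40), ("t", 40, 45)]
    for region in partition_by_class(alignment):
        print(region)
    print(itakura_saito([1.0, 2.0, 4.0], [1.5, 2.0, 3.0]))
```

In a scheme like the one described, each class region could then select a class-specific codebook or constraint set. The measure is asymmetric, so the order of the clean and processed spectra matters when reporting scores.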


Similar resources

Text-directed speech enhancement using phoneme classification and feature map constrained vector quantization

This paper presents and evaluates a novel text-directed speech enhancement algorithm for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classifications. Classes considered include stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used ...


Text-Directed Speech Enhancement Employing

There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech...


Performance Analysis of Speech Enhancement Algorithm for Robust Speech Recognition System

Widely Speech Signal Processing has not been used much in the field of electronics and computers due to the complexity and variety of speech signals and sounds with the advent of new technology. However, with modern processes, algorithms, and methods which can proc Demand for speech recognition technology is expected to their mobile phones as all purpose lifestyle devices. In this paper, an imp...


Feature extraction in opinion mining through Persian reviews

Opinion mining deals with an analysis of user reviews for extracting their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in such area. In general, opinion mining extracts user reviews at three levels of document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...


An axiomatic approach to soft learning vector quantization and clustering

This paper presents an axiomatic approach to soft learning vector quantization (LVQ) and clustering based on reformulation. The reformulation of the fuzzy c-means (FCM) algorithm provides the basis for reformulating entropy-constrained fuzzy clustering (ECFC) algorithms. This analysis indicates that minimization of admissible reformulation functions using gradient descent leads to a broad varie...



Journal title:
  • Speech Communication

Volume 21  Issue 

Pages  -

Publication date 1997